Human emotions play a crucial role in effective communication and decision-making. In the era of artificial intelligence (AI) and human-computer interaction (HCI), enabling machines to recognize, interpret, and respond to human emotions has become increasingly important. This research presents a real-time Facial Emotion Recognition (FER) system that detects and classifies seven key human emotions: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral, from live video streams using a Convolutional Neural Network (CNN) model.
The model was trained on the FER2013 dataset, using data augmentation, class balancing, and dropout techniques to improve accuracy. The system uses OpenCV for real-time face detection, TensorFlow/Keras for CNN model training, and a Flask-based web application for an interactive user interface. The proposed approach demonstrates strong real-time performance, achieving up to 85% validation accuracy, and offers multiple real-world applications, including mental health monitoring, e-learning, customer service enhancement, and emotion-aware AI assistants.
Introduction
Human emotions are essential to communication and decision-making. With advances in Artificial Intelligence (AI) and Machine Learning (ML), systems can now recognize and respond to human emotions, leading to the growth of affective computing. A key technology in this field is Facial Emotion Recognition (FER), which interprets facial expressions—responsible for over 55% of human communication—to identify emotions.
This project presents a real-time FER system that uses Convolutional Neural Networks (CNNs) for automatic feature extraction from facial images, eliminating the need for manual feature engineering. The system classifies seven primary emotions: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral. It is trained on the FER2013 dataset (35,000+ grayscale images) to improve accuracy and robustness.
System Features:
Real-time emotion detection using webcam input.
Built with OpenCV for face detection and Flask for a web interface.
Uses Haar Cascade classifiers for face detection.
Employs preprocessing (grayscale conversion, resizing, normalization) before emotion classification (see the sketch after this list).
Displays predicted emotions with confidence scores.
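The following is a minimal sketch of how the face-detection and preprocessing steps listed above could be implemented with OpenCV. The 48x48 input size, the function name, and the detector parameters are illustrative assumptions, not the project's exact code.

```python
# Sketch of the detection + preprocessing stage (assumed 48x48 CNN input size).
import cv2
import numpy as np

# Haar Cascade face detector shipped with OpenCV
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def preprocess_faces(frame):
    """Detect faces in a BGR frame and return normalized 48x48 grayscale crops."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    crops = []
    for (x, y, w, h) in faces:
        roi = gray[y:y + h, x:x + w]                 # crop the detected face
        roi = cv2.resize(roi, (48, 48))              # match the CNN input size
        roi = roi.astype("float32") / 255.0          # normalize to [0, 1]
        crops.append((roi.reshape(1, 48, 48, 1), (x, y, w, h)))
    return crops
```

Each returned crop is in the grayscale, normalized format expected by the CNN classifier described in the methodology.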
Applications:
Healthcare: Assist therapists in remote emotional assessments.
Education: Monitor student engagement in virtual classes.
Customer Service: Analyze user satisfaction.
Gaming & VR: Adapt gameplay based on user emotions.
Literature Review Highlights:
Early methods (e.g., LBP + SVM) needed manual feature extraction and struggled with real-world variability.
Deep learning (CNNs) has improved accuracy and robustness, especially with large datasets like FER2013.
Real-time and lightweight models (e.g., optimized CNNs) are now enabling deployment on devices like smartphones and IoT systems.
Methodology:
Existing Methods:
Depend on handcrafted features (e.g., LBP, HOG), as in the baseline sketched after this list.
Use classical classifiers (e.g., SVM, KNN).
Often process static images and lack real-time capabilities.
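For contrast, here is a minimal sketch of the kind of handcrafted-feature pipeline described above (LBP histograms fed to an SVM). The choice of scikit-image and scikit-learn, and the parameter values, are assumptions for illustration; this is not the paper's baseline implementation.

```python
# Illustrative classical baseline (not this paper's method): LBP features + SVM.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray_image, points=8, radius=1):
    """Compute a uniform-LBP histogram as a fixed-length texture descriptor."""
    lbp = local_binary_pattern(gray_image, points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)
    return hist

# Assumed data: X_train is a list of 48x48 grayscale faces, y_train the emotion labels.
# features = np.array([lbp_histogram(img) for img in X_train])
# clf = SVC(kernel="rbf").fit(features, y_train)
```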
Proposed Method:
CNN-based model for automatic, deep feature extraction.
Real-time video processing using OpenCV.
Data augmentation and regularization techniques (e.g., dropout, learning rate tuning) improve generalization and robustness; a model sketch follows this list.
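Below is a minimal sketch of a CNN of the kind proposed above, including dropout and data augmentation. The layer sizes, optimizer settings, and augmentation parameters are assumptions and may differ from the trained model reported in this work.

```python
# Sketch of a small CNN with dropout, plus a data-augmentation pipeline (assumed settings).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes=7, input_shape=(48, 48, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),                     # regularization against overfitting
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# Data augmentation for training: rotation, shifts, zoom, horizontal flip, rescaling.
augmenter = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=15, width_shift_range=0.1, height_shift_range=0.1,
    zoom_range=0.1, horizontal_flip=True, rescale=1.0 / 255.0)
```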
System Architecture Workflow:
User launches the Flask web app.
On clicking “Start Detection,” the webcam activates.
OpenCV detects faces in real time.
Detected faces are preprocessed (cropped, grayscaled, resized).
The trained CNN predicts the emotion and provides a confidence score (see the Flask sketch after this list).
Results are displayed on the web interface.
User can stop detection anytime.
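A minimal sketch of how this workflow could be wired together in Flask, streaming annotated webcam frames to the browser, is shown below. The route name, the model file path, and the label ordering are illustrative assumptions rather than the project's exact implementation.

```python
# Sketch of the Flask streaming loop: webcam -> Haar detection -> CNN -> annotated MJPEG feed.
import cv2
import numpy as np
from flask import Flask, Response
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("fer_cnn.h5")                 # assumed path to the trained CNN
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def generate_frames():
    cap = cv2.VideoCapture(0)                    # webcam input
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
            # Same crop/resize/normalize steps as the preprocessing sketch above
            roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
            probs = model.predict(roi.reshape(1, 48, 48, 1), verbose=0)[0]
            label = f"{EMOTIONS[int(np.argmax(probs))]} ({probs.max():.0%})"
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(frame, label, (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        ok, buf = cv2.imencode(".jpg", frame)
        yield (b"--frame\r\nContent-Type: image/jpeg\r\n\r\n" + buf.tobytes() + b"\r\n")

@app.route("/video_feed")
def video_feed():
    return Response(generate_frames(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(debug=True)
```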
Conclusion
This research introduces a real-time facial emotion recognition system that combines deep learning (CNN) with computer vision (OpenCV) and a Flask-based web interface to classify seven emotions with high accuracy. The model, trained on the FER2013 dataset, automatically extracts facial features and produces predictions in real time.